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Abstract. 

We propose a new physically-based "multifractal stress activation" model of earthquake interaction 
and triggering based on two simple ingredients: (i) a seismic rupture results from activated processes 
giving an exponential dependence on the local stress; (ii) the stress relaxation has a long memory. 
The combination of these two effects predicts in a rather general way that seismic decay rates after 
mainshocks follow the Omori law 1 /V with exponents p linearly increasing with the magnitude Ml 
of the mainshock and the inverse temperature. We carefully test the prediction on the magnitude 
dependence of p by a detailed analysis of earthquake sequences in the Southern California Earthquake 
catalog. We find power law relaxations of seismic sequences triggered by mainshocks with exponents 
p increasing with the mainshock magnitude by approximately 0.1 — 0.15 for each magnitude unit 
increase, from p{Ml = 3) « 0.6 to p{Ml = 7) « 1.1, in good agreement with the prediction of the 
multifractal model. The results are robust with respect to different time intervals, magnitude ranges 
and declustering methods. When applied to synthetic catalogs generated by the ETAS (Epidemic- Type 
Aftershock Sequence) model constituting a strong null hypothesis with built-in magnitude-independent 
p-values, our procedure recovers the correct magnitude-independent p-values. Our analysis thus suggests 
that a new important fact of seismicity has been unearthed. We discuss alternative interpretations of 
the data and describe other predictions of the model. 



3 



1. Introduction 

There are now many evidences that the space-time organization of earthquakes is consistent with 
the idea that a single physical triggering mechanism is responsible for the occurrence of aftershocks, 
mainshocks, foreshocks, and multiplets, leading to the more encompassing concept of earthquake 
triggering (see for instance [Lin and Stein, 2004; Feltzer et al., 2004; Murru et al., 2004; Hue and Main, 
2003; Marsan, 2003; Helmstetter and Sornette, 2003; Papazachos et al, 2000]). 

Earthquake triggering has been modeled using a variety of approaches including, static stress 
transfer calculations [King et al, 1994; Stein, 2003], dynamic triggering processes [Voisin, 2002; 
Perfettini et al, 2003], as well as more phenomenological epidemic- type aftershock sequence (ETAS) 
models [Kagan and Knopoff, 1981; 1987; Ogata, 1988] which are based on "mutually self-excited point 
processes" introduced by Hawkes [1971] (see [Helmstetter and Sornette, 2002a] for properties of the 
ETAS model and a review of the literature). The later modeling approach in particular has been able 
to rationalize most of the phenomenological statistical properties of earthquake catalogs, such as the 
larger proportion than normal of large versus small foreshocks, the power law acceleration of stacked 
seismicity rate as a function of time to the mainshock, the spatial migration of foreshocks toward the 
mainshock when averaging over many sequences, the independence of foreshocks precursory properties 
as a function of the mainshock size, the existence of correlations in seismicity over surprisingly large 
length scales [Helmstetter and Sornette, 2003a], Bath's law [Helmstetter and Sornette, 2003b] and so on. 

Its appeals due the simplicity of its premise, its power of explanation of a large set of empirical 
observations and the relative ease with which it can be implemented for rigorous statistical tests has 
made the class of ETAS models the natural null hypothesis against which to test any other model of 
seismicity. The class of ETAS models suffers however from a major drawback, that is, the lack of a 
clear physical basis. The ETAS model is a statistical phenomenological construction which postulates 
a dependence of the present seismic rate on past seismic rates propagated forward in time via a bare 
Omori propagator. 

One would thus like models in which seismic rates derive from the interaction between and transfer 



4 



of physical fields, such as stress, strain rates and fluid flows. For this purpose, we present here a 
generalization which has a sound physical basis, as it relies on two fundamental physical ingredients: 
rupture activation and stress transfer. Rupture activation is described by the generic thermal activation 
processes at the origin of the state-and-velocity dependent friction laws, stress corrosion effects, and 
so on. In the description of stress transfer, we take into account the long-time memory effects in the 
relaxation of stress flelds due to the visco-elasto-plastic rheology of the crust and upper mantle. Using 
this model, we predict a new phenomenon, the dependence of the exponent p of the Omori law on 
the mainshock magnitude. This prediction is tested successfully by a careful analysis of the Southern 
California earthquake catalog. For this and for other reasons that will become clear below, we coin our 
model the "multifractal stress activation" model. 

The organization of this paper is the following. In section 2, we first discuss the fundamental 
physical ingredients of the "multifractal stress activation" model. Section 3 gives its detailed definition. 
Section 4 shows by analytical calculations that the Omori law ~ l/V^ derives naturally from the 
model, but with an exponent p which is an increasing linear function of the mainshock magnitude. 
This surprising prediction results from the interplay between the exponential activation process and 
the long-time memory of stress relaxation processes. Section 5 tests this prediction using different 
dcclustcring techniques to identify mainshocks at magnitudes ranging from 1.5 to 7.5 in the Southern 
California catalog and to stack their corresponding aftershock sequences. We find a remarkable 
agreement with p{M) increasing from approximately p = 0.6 for M = 3 to p = 1.1 for M = 7. We 
also present tests on synthetic catalogs generated with the ETAS model. Section 6 presents other 
observations that can be reinterpreted within the framework of our model and discuss other predictions. 
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2. Fundamentals of the "multifractal stress activation" model 

2.1. Earthquakes as thermally activated processes 

We model seismic activity as the occurence of frictional sliding events and/or fault ruptures that 
are thermally activated processes facilitated by the applied stress field. The relevance of thermal 
activation is made clear when examining the underlying physical processes of the various proposed 
models of earthquakes, which we now briefly review. 

• Thermal activation is known to control creep rupture (also called static fatigue) (see for instance 
the review in [Scholz, 2002] for the application to rock mechanics and to earthquakes and [Ciliberto 
et al., 2001; Politi et al, 2002; Saichev and Somette, 2003] for recent experimental and theoretical 
developments). 

• The use of the Eyring rheology and other thermally-dependent friction laws are of standard use 
for describing creep failure in a variety of compounds [Liu and Ross, 1996] as well as material 
interfaces [Vulliet, 2000]. These laws consist in adapting, at the microscopic level, the theory of 
reaction rates describing processes activated by crossing potential barriers. 

• Stress corrosion occurs in the presence of pre-existing cracks in quartz, quartz rocks, calcite rocks, 
basaltic rocks, granitic rocks and many other geological materials [Atkinson et al., 1981; Atkinson, 
1984] by the mechanism of hydrolytic weakening [Griggs et al, 1957; Griggs and Handin, 1960] 
which is also thermally activated (see [Somette, 1999] for a review and references therein). 

• The Ruina-Dieterich state-and-velocity dependent friction law [Dieterich, 1979; Ruina, 1983; 
Scholz, 1998] results physically from creep of the surface contact and a consequent increase in 
real contact area with time of contact [Scholz and Engelder, 1976; Wang and Scholz, 1994]. The 
logarithmic form In V of the velocity dependence of the friction coeflacient in the Ruina-Dieterich 
law is usually assumed to derive from an Arrhenius activated rate process describing creep at 
asperity contacts [Stesky, 1977; Chester and Higgs, 1992; Chester, 1994; Heslot et al, 1994; 
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Brechet and Estrin, 1994; Baumberger, 1997; Sleep, 1997; Persson, 1998; Baumherger et al, 1999; 
Lapusta et al., 2000; Nakatani, 2001]. 

2.2. Long-time memory effects in the relaxation of stress fields 

A recent re-construction of the regional strain rate map in California from GPS recordings over 
a dense network shows that the largest strain rates are controlled by past large earthquakes and are 
found in the regions where aftershock activity is still noticeable [Jackson et al., 1997]. This suggests 
the relevance of a stress-controlled earthquake activation. Post-seismic slip and strain rate relaxation 
following large earthquakes have been modeled by visco-elastic flows, which govern the evolution of the 
stress field and thus the loading and unloading processes of major earthquake generating faults [Deng 
et al., 1998]. The simplest models assume linear visco-elastic rheologies, which lead to exponential 
strain relaxation. These models in general account for the short-term relaxation processes over a time 
scale ranging from a few months to one year [Pollitz et al., 2000]. Over long time scales, it is necessary 
to take into account the presence and geometry of lower crustal and mantle shear zones, which lead to 
more complex and slower decaying relaxation rates [Kenner and Segall, 1999]. To adequately model 
long-term postseismic relaxation (i.e. that occurring years to decades after an earthquake) for instance 
after the Landers and Hector Mine earthquakes, it was found necessary to add other slower deformation 
processes [Freed and Lin, 2000; 2001]. Evidence of large post-seismic relaxation times have been 
found within the crust in the case of slow-rate intracontinental events [Calais et al, 2002]. Long-term 
stress relaxation process are also found in empirical laws of creep experiments applied to model the 
brittle creeping fault zone, which can account for both the time evolution of afterslip, as measured 
from geodesy, and of aftershocks decay. In this framework, aftershock sequences and deep afterslip, as 
constrained from geodetic measurements, follow the same temporal evolution [Perfettini and Avouac, 
2004]. Generally, slower-than-exponential relaxation is found in disordered materials which can be 
characterized by nonlinear rheologies [Klinger, 1988; Chung and Stevens, 1991; Phillips, 1996]. 
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3. Formulation of the "multifractal stress activation" model 

3.1. General case 

Let us denote by A(r, t) the intensity (or average conditional seismicity rate) at position r and 
time t. Putting together the two physical ingredients of rupture activation and stress transfer discussed 
above, we formulate the following model. The thermal rupture activation process is expressed as 

A(f,t)~exp[-/3i;(r,t)] , (1) 

where /3 is the inverse temperature (specifically /3 = 1/ kT where k is the Boltzmann constant) and the 
energy barrier E{f,t) for rupture can be written as the sum of a contribution -E'o(r) characterizing the 
material and of a term linearly decreasing with the locally applied stress T,{f,t) [Zhurkov, 1965]: 

E{f,t)=Eo{f^-VJ:{f,t) . (2) 

y is a constant which has the dimension of a volume and S(f, t) is the total stress at position f and 
time t. The decrease of the energy barrier E{f,t) as a function of the applied stress S(r, t) in (2) 
embodies the various physical processes of stress corrosion, state-and-velocity dependent friction and 
mechano-chemical effects, aiding rupture activation under stress. The stress T,{f,t) results itself from 
all past events according to 

S(f, t) = Sfor field (r, t)+ f I dN[df ' X dT]Aa{f T)g{f- r',t-T) . (3) 

J —CO J 

Putting all this together yields 



\{f,t) = Atec(^t)exp 



(3 j j dN[df'xdT]Aa{f',T)g{f-f',t-T) 

J — OO J 



(4) 



In this expression, the term /3 now incorporates the volume term V and the inverse temperature 
(/3 = V/kT), so that /? has now the dimension of the inverse of a stress. The double integral gives the 
stress at position r and time t as the sum of the stress load contributions over all past earthquakes at 
earlier times t < t and positions r '. A given past event at r ' and time t contributes to the stress 
at r and time t by its stress drop amplitude Aa{f ', r) which is transfered in space and time via the 
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stress kernel (or Green function) g{f — f" ,t — t), taking into account both time relaxation and spatial 
geometrical decay. The kernel ^(r — f ' ,t — t) describes the combined effects of all stress relaxation 
processes in space and time, that determine the stress field in the seismogenic layer, as discussed in 
section 2.2. The term dN[df ' x dr] is the number of events in the volume df ' that occurred between 
T and T + dr. Finally, Xtec{'i^,t) is the spontaneous seismicity rate in absence of stress triggering by 
other earthquakes and accounts for the tectonic loading (far field stress), which may in general be 
non-homogeneous and space and perhaps depends on time. 

It is convenient to discretize space in cells and rewrite (4) as 



where A,(t) is the average conditional seismic rate in cell i at time t, Acrj(r) is the stress drop in cell j 
that occurred at time r due to an earthquake and gij [t — r) measures the fraction of the stress drop 
that occurred at time r in cell j which is transfered to cell j at time t. 

Our model (4,5) is reminiscent of the stress release model, but there are several important 
differences that are worth noting. The single cell stress release model was introduced by Vere-Jones 
[1978] as a stochastic implementation of Reid's theory of elastic rebound theory [Reid, 1910]. The 
generalization to account for long-range elastic stress transfer was done in [Zheng and Vere-Jones, 1991] 
and in [Liu et al., 1998; Shi et al., 1998] (see [Bebbington and Harte, 2003] for a review and references 
therein). The most general form of the stress release model (SRM) reads 



In the SRM, the tectonic loading increases the stress at cell i linearly in time according to bit and 
earthquakes on this cell and elsewhere relax (or load) the stress at cell i. Sj{t) = '}2iT<t ^^ji''') 
cumulative stress release over all past earthquakes that occured in that cell j. Sj{t) impacts region i 
through the time- independent coupling (or stress transfer) coefficient gij. In our model, the tectonic 
loading appears in contrast through the rate Xtec{r,t). The SRM views the earthquakes as mostly 
unloading this tectonic stress (of course stress load is possible) while we view the earthquakes more 






(6) 
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symmetrically as both promoting or shadowing the seismic activity elsewhere around an average rate 
controlled by Xtec{r, t). Finally, the most important difference lies in the fact that the SRM assumes an 
infinite time memory of past stress releases in the cumulative stress release function Sj{t), while we 
view the stress transfer by each past earthquake as transient due to visco-elastic processes in the crust 
and mantle. This infinite time memory of the SRM is made necessary to compensate for the tectonic 
loading hit in order to obtain a statistically stationary process. In contrast, our model is better devised 
to deal with transient stress perturbations induced by past earthquakes. As a consequence, the SRM is 
not built to produce aftershocks (see however the two-cells version of [Borokov and Bebbington, 2003], 
which does produce first generation aftershocks obeying the Omori law). The fits of seimic catalogs to 
the SRM indeed use declustered data. Thus, the fundamental difference between our model (4,5) and 
the linked SRM (6) is that the later does not describe either aftershocks or delayed triggered seismicity 
(Imoto et al. [1999] introduced a fixed one-time delay in order to produce periodic-type behavior). Our 
model (4,5) is an important generalization to account for the delayed triggering processes, which have 
been found to explain many phcnomenological observations of seismicity [Helmstetter and Sornette, 
2003a] . Actually, all the results that we derive below derive from the time-dependent memory kernel of 
our model, which are thus absent in the SRM. 

3.2. Reduction to time-only dynamics 

Starting from (4), we write the space- time kernel g{f,t) in a separable form 

g{f,t)=f{r)xh{t). (7) 

This choice is made for the sake of simplicity and is in the spirit of the specification of the ETAS which 
also assumes separability of the bare kernel in space and time. Helmstetter and Sornette [2002b] have 
shown that the cascade of triggering of events which are decoupled in time and space in their first 
generation eventually leads to a coupling in space and time corresponding to a sub-diffusion process. 
Here, we expect a similar mechanism to operate when taking into account all generations of earthquake 
triggering (a mainshock generates aftershocks of first-generation, which themselves trigger aftershocks 
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of second-generation and so on). 

With the separable form (7), expression (4) can be transformed into 



A(f,i) = Atec(r,i) exp 

where 



P dr s{f,T)h{t-T) 



(8) 



s(f, r) = dr" Aa{r ', r) /(f - r ') (9) 



is the effective source at time r at point r resulting from all events occurring in the spatial domain at 
the same time r. The separable form of the kernel g{f, t) in (7) allows us to study the time-dependent 
properties of the model, independently of its space properties. Since expression (8) is defined for any r, 
if we assume space homogeneity, or if we restrict to a specific domain, we can drop the reference to r 
without loss of generality and get 



\{t) = Atec(i) exp 



13 I dr s{t) h{t - r) 

J —OO 



(10) 



In contrast with the ETAS model in which the time-only equation of the conditional Poisson rate 
describes the seismicity integrated over all space, here the time-only equation (10) refers to a specific 
location. If space is homogeneous {\if,c{r,t) and s(r, r) are independent of f), then equation (10) gives 
the conditional seismic Poisson intensity at any point. 

3.3. The distribution of stress source strengths 

The fact that the source is given by (9) should allow us to constrain its statistics. Indeed, 
Kagan [1994] has suggested using theoretical calculations, simulations and measurements of rotation of 
earthquake focal mechanisms that the stress change in earthquake focal zones due to past earthquakes 
should follow the symmetric Cauchy distribution 

= i ^ , (11) 

or perhaps even distributions decaying as power laws with even smaller exponents. The Cauchy 
distribution is a stable Levy law with power law tail exponent fj, = 1 and can be shown to be the 
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distribution for the stress at any point due to a random uniform distribution of sources mediated to that 
point via the elastic Green function [Zolotarev and Strunin, 1971; Zolotarev, 1986] (see also [Sornette, 
2004], chap. 17, for a general presentation). The physical mechanism for the Cauchy distribution is thus 
precisely the summation (9) over earthquake sources Aa{f ') with stress transfer given by the elastic 
Green function f{f— f ') in the crust. The Cauchy exponent n — 1 is obtained for a uniform spatial 
distribution of sources in 3D with the elastic Green function ~ 1/r^ in 3D, or for a uniform spatial 
distribution of sources in 2D with the elastic Green function 1/r^ in 2D. For sources on a fractal with 
dimension £)/ < 3, /x = -D//3 in 3D [Kagan, 1994]. 

However, it must be kept in mind that the large values of the stress sources s that contribute to 
the slow power law decay of the Cauchy distribution result from the assumption that earthquakes are 
point-wise such that a probe put at random in the medium can come arbitrarily close to these sources: 
it is the divergence ^ 1/r^ (in 3D) of the stress field close to such a singular source which is at the 
origin of the Cauchy distribution (see Chap. 17 of [Sornette, 2004]). In reality, such singular power law 
behavior transforms into a much weaker l/^A" singularity close to crack tips and then crosses over to a 
smooth behavior due to plasticity and damage that smooth out the singularity sufficiently close to the 
fault edges. 

To capture in a phenomenological way these features, we will use a power law distribution of the 
source strengths 

(S^ + Sq) 2 S 

where the scale factor C is given by C oc Sq and sq is a characteristic scale proportional to the average 
stress drop. The value = 1 recovers the special case of the Cauchy distribution advocated by Kagan 
[1994]. Notice that the distribution of the source strengths is symmetric, implying that the effect of an 
earthquake in the past can either enhance or shadow the present seismicity. This generalizes the ETAS 
model as both stress triggering and stress shadowing are taken into account symmetrically while the 
ETAS model describes only stress triggering. 
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3.4. Time-dependent stress relaxation kernel 

The last ingredient we need to specify is the time dependence of the memory kernel h{t). We 
postulate a stress relaxation function 

which is of the Omori form with the usual small time-scale cut-off c. To ensure convergence of the 
correlation function of deterministic processes with memory governed by h{t) for any possible values of 
6, we truncate the power law in (13) at some large time T, which we call the "integral time scale:" it 
is the largest time scale up to which the memory of a past event survives. T can thus be interpreted 
as the effective Maxwell time of the relaxation process. The sharp cut-off implied by T is invoked for 
convenience and can be replaced by a smooth cross-over using for instance an exponential roll-off of the 
form 

= (fT^ • 

without changing our main conclusions below. 

Just after an event over a time scale slightly larger than the time for the propagation and 
attenuation of dynamical stress waves, that we note t = 0+ for short, the stress equilibrates to its static 
value and we should have 

= ^ = 1 > (15) 

to express that the static stress has not had time to relax yet. 

Expressions (13) or (14) can be rationalized from the time dependence of the visco-elastic Green 
function in ID which gives 9 = —1/2, in 2D which gives ^ = or in 3D which gives 6 = 1/2. In fact, 
a better formulation will require a space-time dependence of the evolution of the stress field. Another 
argument is to view to local stress as proportional to the local strain rate, which is itself proportional to 
the local microscopic seismic rate which obeys the Omori law. Prom a micro-mechanical point of view, 
such slow relaxation process (13) are associated with dislocation motion, stress corrosion and hydrolytic 
weakening processes [Sornette, 1999]. We also would like to underline that expression (13) implies 
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the absence of a well-defined characteristic time scale for c < t < T where c <^T, and embodies the 
complex non-Maxwell relaxation processes in the crust, its coupling with the lower visco-elasto-plastic 
crust and more ductile upper mantle. The coexistence of many different time scales can be captured by 
such power law decay. 

Actually, there is a more profound origin of the dependence (13) of the stress relaxation. Assume 
that an earthquake loads a given region according to the elastic stress redistribution that can be 
estimated using standard methods of elasticity [King et al, 1994; Stein, 2003]. Then, by the activation 
processes discussed in section 2.1, this stress redistribution induced by the "mainshock" will give rise 
to an increase of seismicity at that region. The triggered earthquakes will lead to new sources of stress 
redistribution, which themselves modify the stress field, tending to decrease it. This process gives 
rises to a slow power law relaxation of the stress field [Lee and Sornette, 2000] . The power law decay 
embodied in (13) can thus be viewed as resulting from the microscopic process of stress redistribution 
and relaxation which occur below the scale of observation. There is no reason for the physics to change 
and this law (13) is an effective renormalization of many microscopic relaxation processes. 

3.5. Summary of the model definition 

Summarizing, in a discrete form, our model reads 



where the stress at time t is the sum of the contributions over all previous earthquakes that occurred 
at times ti < t, with stress sources given by the power law distribution (12) and with a power law 
time-dependent stress relaxation kernel (13). In the sequel, we take a constant seismic rate Atec in the 
absence of stress perturbation s{ti) = 0. 
It is convenient to rewrite (16) as 





(16) 



i I ti<t 



A(i) = Atec e' 



,/3w(t) 



(17) 
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where 



(18) 



i I ti<t 



Model (16) belongs to the class of nonlinear self-excited point-processes, in which the nonlinear 
function is the exponential in our case [Bremaud and Massoulie, 1996; Bremaud et al., 2002]. Bremaud 
et al. [2002] have given the general condition guaranteeing the existence of a statistically stationary 
solution in the case of unbounded nonlinear function, sub-exponential distribution of the marks (stress 
changes s's) and long memory kernel as is our case. In our model, the introduction of the integral time 
scale T in (13) ensures the existence of the correlation function for any process and for any values of 
the exponent 6. 

4. Derivation of the magnitude dependence of Omori's law 

In this section, our goal is to derive Omori's law from model (16) with (12) and (13). The problem 
can be formulated as follows. Omori's law quantifying the decay of seismic activity after a "mainshock" 
occurring at the origin of time amounts to determining the typical time dependence of X{t) conditioned 
on a value Am realized at t = which is larger than average. This is due to the fact that a mainshock 
of magnitude M induces a local burst of seismic activity proportional to K 10"^, where K and a 
are two positive constants [Helmstetter, 2003]. We note however that previous determinations of the 
productivity exponent a have assumed the constancy of p with mainshock magnitude; in the presence 
of an exponent p{M) which increases with M as we find here, past values of a have probably be 
underestimated. 

4.1. Theory of Omori's law by generalization of conditional expectations of seismic rates 
to power law 

In the case where averages exist, the Omori law can be expressed in the following general form: 



E[A(t)|AM] = AtecE[e'^'^(*)|a;M] , 



(19) 
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where E[.] denotes the mathematical expectation or average of ensemble of equivalent statistical 
realizations of the process. 

In the case where averages and variances and covariances are ill-defined mathematically, as is the 
case for Cauchy distributions and for power laws with /i < 2, a typical measure of conditional seismicity 
rate can be defined at any quantile level q by the probability Pr[A(t) > A^IAm] that the rate X{t) be 
larger than the quantile Ag conditioned on the fact that the seismic rate was at some given value Am at 
time 0: 



Pr[A(t) > A,|Am] = Pr[e^"(*) > -^\ujm] = Pr[u;(i) > (1//3) In ) \ujm] ■ (20) 



^|u;M]=PrMi)>(l//3)lnf^ 

^tec V Atec , 

We are interested in monitoring the time evolution Xq{t) of the seismic rate quantile at some probability 
level q (which can be varied to explore different fluctuation levels). 

If the source terms s{ti) were centered Gaussian random variables, w would also be normally 
distributed. Using (19), this would allow us to obtain 



E[e'''^W|wM] =exp 

where 



2 



/3E[w(t)|wM] + ^VBx[i0{t)\i0M] 



(21) 



Var[a;Mj 

Using the definition (18), this would provide a closed formed expression for the Omori law describing 
the relaxation of the conditional rate E[A(t)|AM]. The physical meaning of (22) is that one can write a 
linear regression 

w{t) = j{t)ujM + e , (23) 
where ^(t) is a non-random factor and e is a centered Gaussian noise with zero correlation with ojm- 
Equation (23) writes that the best predictor of w given um is 7Wm, i-e., E[a;(f)|a;M] = 7Wm with 

_ Coy[u{t),u;M] . . 

Var[a;M] ' ^ ' 

which retrieves (22). 

However, as we explained above, the sources are distributed according to a distribution (12) which 
can be expected to have a heavy tail, such that both its variance and average are not mathematically 
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defined, or if they are defined converge very poorly to their asymptotic values even in large data sets. 
Thus, the standard statistical tools of expectation, variance and covariance can not be used and we 
need a completely novel approach to tame mathematically these wild fluctuations. For this, we use 
the insight that the natural generalization of the variance for power laws p{x) w C/x^~^^ with infinite 
variance (i.e., with < 2) is the scale parameter C, as it possesses most of the properties of the variance 
for Gaussian random variables: it is additive under convolution of the distribution and it replaces the 
variance in the expression of the characteristic function of the distribution (see Chap. 4 of [Somette, 



In the power law case, due to the linear form of (18), we can still write (23) but with w{t),u)M and 
e being power law distributed random variables with the same exponent /x and with scale factors equal 
respectively to C^j (for oj and (Jm) and C^. The key idea is that 7 can be determined by a generalization 
of (24), involving generalizations of the covariance and variance. This generalization consists in forming 
the random variable defined as the product uwm = + ^'^m- It is straightforward to show that 
the distribution of loum consists of two main contributions, (i) a dominant power law with exponent 
/i/2 and scale factor C^^m = l**^^ C'o,, and (ii) a sub-dominant power law with exponent (with a 
logarithmic correction) and scale factor C^Ce- This has the following practical implication: if one 
measures or calculates the leading power law decay of a; x lom , the measure of its scale factor gives 
access to the parameter 7 through the expression 



where the time dependence of j{t) comes from that of C^um as we show below. 

This expression (25) generalizes the standard result (24): notice that the case fi = 2 recovers (24) 
with the correspondence C^, = Var[ci;] and C^^u^m = Cov[ci;(t), wm]- This is expected since, as we said 
above, the scale factor reduces to the variance for the Gaussian distribution and the stable Levy law 
with exponent fi = 2 turns out to be nothing but the Gaussian law! We note that this method consisting 
of generalizing the covariance by introducing the concept of "tail-covariance" has been previously used 



2004]). 



2 




(25) 
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to extend the Kalman filter of data assimilation to power law distributed noise processes [Sornette and 
Ide, 2001]. 

Using (18), we form the product 

oj{t)ujM= 12 s{ti) s{tj)h{t-ti)h{-tj) , (26) 

i I ti<tj I tj<0 

where the s's are random variables with power law tail with exponent /U (specifically, in this paper, the 
s's are Cauchy variables with = 1 but we give the derivation for any value of /i). Then, using standard 
calculations (see Chap. 4 of [Sornette, 2004]), the terms in the double sum in (26) that contribute 
to the leading asymptotic power law tail with exponent /Lt/2 correspond to the diagonal terms i = j, 
while all the other terms contribute to the sub-leading power law tail with exponent n with logarithmic 
corrections. This gives the expression of the scale factor cijtl^ of the dominating power law with 
exponent /z/2 

Cit!^^=Cu, E [h{t-U)h{-ti)]^ . (27) 

i I ti<0 

Together with (25), this yields 

7 = ( E [Kt-U)h{-t,)\'i\ . (28) 

\i I *i<0 / 

Since h is dimensionless, 7 is also dimensionless, as it should from its definition (23). 

In order to perform a theoretical analysis of (28), it is convenient to transform the discrete sum 
into a continuous one. Let us consider the times ti's in the sum J2i \ ti<o ™ i^^)- Ideally, these times 
ti should themselves be determined self-consistently and are known to follow on average an inverse 
Omori law. However, such an inverse Omori law is a statistical property observed only for a large 
ensemble of stacked foreshock sequences while individual sequences exhibit approximately constant 
seismic rates [Helmstetter and Sornette, 2003a]. In order to simplify the analysis, we thus approximate 
the seismicity prior to a mainshock as being approximately constant in time and uniform in space. 
This allows us to introduce the average time interval At between two events preceding the mainshock. 
We expect At to be of the order of the small time-scale cut-off c in (13). This is natural if the stress 
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relaxation is in significant part due to seismicity itself: then, c is the time scale beyond which the stress 
relaxation starts to be felt, i.e., when new earthquakes occur in the vicinity of the source. Then, we can 
approximate the discrete sum as follows: 

r° dt 

ti<0 

Using the form (13) for the stress relaxation kernel gives in continuous time after some manipulations 

= I ^ ^ \ 1 A ^ 

- ^^2/^ yt{l+0)^.-l {X + 1 + (c/f))(i+«)'^/2 (x + (cA))(i+«)A'/2 ) ' ^^^> 

where ho = c^+^ according to (15). We verify that 'y{t) is dimensionless as it should. Under the change 
of variable x ^ y = x + j, expression (30) can be written 

= Sr. {-d^ ^ '^(^^) ' ^''^ 

where m = (1 + ^)/x/2. 

We now have all the ingredients to estimate (20). We thus obtain 

Pv[u}{t) > y\uiM] = Pr[7WM + e > vIc^m] = Pr[e > y- 7Wm|wm] = F{y - 7(t)wM) , (32) 

where F{e) is the complementary cumulative distribution of e. Using (32) in (20), this leads to 

Pr[A(t) > A^IAm] = F f (1//3) In ( ^] - j{t)u;M) • (33) 



The typical time evolution of the seismicity rate A(t) conditioned on the rate Am at time is thus given 
by fixing the quantile probability to some level Pr[A(t) > A^IAm] = q, leading to 

^ln^-^^-j{t)u;M = F-\q), (34) 

or equivalently 

Xg{t) = A, Atec e'^^(*)'^" , (35) 

where Ag = exp (^/3F~^{q)). The time-dependence of the seismic decay rate is thus determined by (35), 
which requires the determination of the time-dependence of j(t) given by (30) (and more generally by 
(28)). 
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In the case where the stress relaxation is a pure exponential h{t) ~ e~*/-^, then using (30), we 
find 7(t) ~ e~*/^ also and the seismic rate Xq{t) given by (35) also relaxes exponentially to a constant 
background. The novel effect that we describe below comes from the interplay between the exponential 
thermal activation process and the long-memory process of the stress relaxation. 

Note that this approach holds also for jj, > 2 for which variances and covariances exist, which can 
allow the application of (24). However, using this power approach provides more robust estimators of 
the typical values of seismic rates when the convergence of the mean and of variances are very slow, as 
occurs for power laws (see Chap. 3 of [Sornette, 2004]). 

4.2. Study of the predicted seismicity rate for different values of the two key exponents /i 
cind 6 

Our purpose is to show that, for a rather broad range of values of the exponents fi and defining 
the model, Xq{t) is approximately given by 

^«(*) - Jm) ' (36) 

with 

p{M) = apM + b(3 , (37) 

where a > and M is the mainshock magnitude. In a nutshell, expression (36) with (37) result from 
the interplay between the heavy-tailed distribution of stress perturbations and the long time memory of 
the stress relaxation on the one hand and the exponential dependence of the seismic rate on the stress 
field on the other hand. 

4.2.1. Case 6 = —1/2 and /U = 2 (or with a Gaussian distribution of stress source 
strengths) 

This interplay between the long time memory of the stress relaxation and the exponential 
dependence of the seismic rate on the stress field is exemplified by the case = —1/2 and /U = 2, which 
has previously been shown to give exact multifractal properties in the time domain [Muzy and Bacry, 
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2002] , and as a consequence continuous dependence of the relaxation exponents p{M) as a function of 
shock magnitudes M. This continuous dependence of the exponent p{M) has actually been documented 
empirically in this case in another context of aftershock decay following shocks in financial markets 
[Sornette et al., 2003]. Schmitt [2003] has studied in details the properties of the process (17) with (18), 
when s(t) is a Gaussian white noise (corresponding formally to = 2), with the memory exponent 
61 = -1/2. 

An intuitive grasp of this behavior is obtained by examining d-y/dt: 



d7_ (feg/At^/^) 
dt t 



(38) 



.(T + C-t)V2 (f + c)l/2 

It is easy to see that the first (resp. second) term of the bracket in the r.h.s. is always larger 
(resp. smaller) than 1, which ensures that ^ is always negative. In addition, for t < T, the 
bracket in the r.h.s. is almost constant and close to 1, showing that dj/dt w — 1/^ and thus 
7(t) w constanti — constant2 x ln{t/T). Then expression (35) leads to (36,37) using the fact that 
COM oc In(AM) oc \n{K 10"^) = alnlO M + lnK, i.e., lum is linearly related to the magnitude M. Here, 
we have used the productivity law that an earthquake of magnitude M produces on average a number 
of aftershocks proportional to the exponential of its magnitude (a is the productivity exponent often 
reported between 0.5 and 1). 

4.2.2. Condition 2m = ^{1 + 9) = 1 

This case corresponds to taking the exponent m = (1 + 0)n/2 defined in (31) equal to 1/2. 

I fj,/2 J ij,/2 

Then, -^^j— has exactly the expression given by the right-hand-side of (38), showing that 
is close to —1/t, and thus 7''/^(t) ~ constanti — constant2 ln(t/T) which, for not too small nor 
too large t's and for constanti < constant2, gives 7^t) « constant'^ — constantj x \n{t/T). This 
yields (36,37). Typically, the power law behavior is observed over more than two decades in time, 
which is comparable to empirical observations. Figure 1 shows the numerical evaluation of 7(f) 
as a function of \n{t/T) for several pairs (/x, 0) which obey the condition /x(l + 0) = 1 exactly: 
(^ = 3,61 = -2/3); (/i = 2,61 = -0.5); (/x = 1,6' = 0); (/x = 2/3, 6» = 0.5); (/x = 0.5,61 = 1), using 
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c/T = 10~^. We verify the existence of a linear decay of 'y{t) in the variable \nt, which qualifies an 
Omori power law (36). The range of this Omori power law regime depends rather sensitively on the 
temperature, on the mainshock magnitude and on the ratio c/T since the logarithm of the seismic 
rate is proportional to 7 via a coefficient of proportionality involving the inverse temperature and the 
mainshock magnitude, as shown in (35). Then, by the same argument as in section 4.2.1, the linear 
dependence of wm on M in expression (35) leads to (36,37). 

The fact that j{t) is asymptotically exactly logarithmic in time for 2m = /x(l + 6) = 1 and 
thus that the seismic rate X{t) is an Omori power law can be recovered from a different construction 
motivated by multiplicative cascades introduced in turbulence. The discrete multiplicative cascade 
model of hydrodynamic turbulence can be extended in distribution to the continuous limit [Schmitt 
and Marsan, 2001] and shown to take the form (18) expressed with the discrete sum replaced by a 
continuous integral. Then, Schmitt and Marsan [2001] show that the only condition for exp[a;] (in our 
case A through (17)) to be "logstable multifractal" is that the exponent /j. of the distribution of the 
innovations s and the exponent 1 + of the memory kernel be related by the condition /i(l + 9) = 1 
that we have derived above through a difi^erent route. In a nutshell, their argument (which applies in 
distribution to one-point statistics but not to multi-point statistics nor in process) is as follows. The 
logarithm of A is constructed as an integral in the time-scale (t, a) domain within a cone with apex 
on the time axis of identically independent distributed innovations distributed according to a Levy 
distribution with exponent fj,. The natural scale-invariant measure for this integral in this time-scale 
domain is dadt/a^ in the sense that it is (left-)invariant by the translation-dilation group [Muzy and 
Bacri, 2002]. The contribution of a vertical strip of width dt = £ centered at time t — t (i.e., at a 
distance r from the present time t) within the cone is ei /t^^ '^^ /^^] — £i(^/(*~'''))^^'') where ei is 
a Levy distributed noise with scale factor unity. The exponent l//x comes from the dependence of the 
scale factor of sum of the £/{t — r) Levy variables that contribute to each trip. Now, in distribution, the 
total integral is thus equal to the sum of all strips up to time t, which gives dr ei{T){£/{t — r))^/''. 
This expression is exactly of the form (18) with 1 + 9 = 1/ ji. QED. 
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4.2.3. Case 2m = + 6') 7^ 1 



It is useful to study the generalization of (38), which reads 



dt 



(39) 



Two cases must be distinguished. 
• For 2m = /x(l + 6') < 1, the first (resp. second) term of the bracket in the r.h.s. of (39) is always 



for t < T, the bracket in the r.h.s. is almost constant and close to (T + c)^ showing that 
d[f2™-i^M/2]/rf^ ~ and thus 7^/2(t) « constanti x (T/f)2'"-i + constant2 x (T/t)'". 7 

is thus a convex decreasing function of Int (with a upward curvature) as shown in figure 2. The 
mathematical form of the decay rate tends to slow down compared with a standard Omori power 
law. This convexity tends to linearity as m — > 1/2. 

• For 2m = /u(l + 6) > 1, we have 1 — (m/2) < m/2. As a consequence, for small t's, 

(T + c — t)™/^ > (T + c)^"'™/^) while the reverse inequality holds true for large Vs. This reasoning 
shows that the bracket is negative for small t's and becomes positive for large t's. Therefore, 



large t's. This translates into 7 increasing for small t's, passing through a maximum before 
decreasing for large t's, as shown in figure 2. In this case, we can often observe an approximate 
linear decay of 7(f) as a function of Int, over two to three order of magnitudes in time in the 
decaying part beyond the maximum, all the more so, the closer m is to one. Over this range, by 
the same argument as in section 4.2.1, the linear dependence of wm on M in expression (35) leads 
to (36,37). 




is positive for small t's, vanishes at an intermediate time and becomes negative for 



dt 
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5. Empirical investigation of Omori's law conditioned on 
mainshock magnitudes 

5.1. Data selection and analysis of completeness 

We use the Southern Californian earthquakes catalog with revised magnitudes (available from 
the Southern California Earthquake Center) as it is among the best available in terms of quality and 
time span. Magnitudes are given with a resolution of 0.1 from 1932 to 2003, in a region from 
approximately 32° to 37°N in latitude, and from -114° to -122° in longitude. 

Using all the data of this catalog, we compute the complementary cumulative magnitude 
distribution for each year from 1932 to 2003 included. This gives 72 Gutenberg-Richter distributions 
which are shown in Figure 3. The logarithms of the distributions as a function of magnitude are 
approximately linear for the largest magnitudes and exhibit a cross-over to a plateau at small 
magnitudes. In order for our analysis to be robust, we need to address the question of completeness 
of the catalog. The level of low-magnitude plateau increases with time, which is the consequence of 
the evolution of the seismic network: as more stations are added, the spatial coverage increases, so 
that the number of events with fixed magnitude also increases. One can also observe that events of 
smaller and smaller magnitudes are detected and located when going from 1932 to the present. This 
also leads to an increase of the number of events recorded in the catalog. As the typical time scale over 
which we analyze the Omori law decay is about 1 year, which is small compared with the average time 
scale of the evolution of the network coverage (see Figure 3), we will consider all events in the catalog 
that have a magnitude that belong to the linear part of the magnitude distribution curve. Figure 4) 
shows several complementary cumulative distributions of earthquake magnitudes for the four years 
1932, 1975, 1992 and 1994. Taking the linear relationship between the logarithm of the distribution 
as a function of magnitude as the standard criterion for completeness [Kagan, 2003], we infer that the 
catalog is approximately complete for Ml > 3 in 1932 and later years, for Ml > 2.5 in 1975 and later 
years, for Ml > 2 for 1992 and later years, and for Ml > 1.5 in 1994 and later years. Since our fits of 
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aftershock decay rate are performed typically over a time range starting beyond one day and extending 
to several hundred days, the short-term lack of completeness identified by Kagan [2004] is not an issue 
here. Actually, the problem of the short-term completeness of aftershock catalogues is the most acute 
for large mainshocks [Kagan, 2004], which allows us to extend the study of the seismic decay rate at 
shorter times for the small mainshock magnitudes. 

In order to maximize the size of the data used for our analysis (to improve the statistical 
significance) and to test for the stability of our inversion, we will consider four different sub-catalogs: 

1. 1932 - 2003 for Ml > S (17, 934 events), 

2. 1975 - 2003 for Ml > 2.5 (36, 614 events), 

3. 1992 - 2003 for Ml > 2 (54, 990 events), and 

4. 1994 - 2003 for > 1.5 (86, 228 events). 

The fact that the number of events in these sub-catalogs increases as their time-span decreases is due 
to the lowering of the minimum magnitude of completeness which more than compensates the reducing 
time interval. 

5.2. Construction of the time series of aftershocks 

5.2.1. Time-shifted stacked sequences: general principle 

We now describe the method used to determine the validity of the Omori law and to measure the 
p-value as a function of the mainshock magnitude. 

• We consider all events in a given sub-catalog and discriminate between mainshocks and triggered 
events ("aftershocks"). Mainshocks are determined by using declustering methods which are 
described below in sections 5.2.2 and 5.2.3: to test for the robustness of our results and their lack 
of sensivity with respect to the specific choice of a declustering method, we present two different 
implementations. 
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• Once the mainshocks are determined, triggered events are defined as those events following 
a mainshock, which belong to a certain space-time neighborhood of it. This space-time 
neighborhood is specified below. We thus have as many triggered ("aftershock") sequences as 
there are mainshocks. 

• In order to test for the predicted dependence of the p-value as a function of magnitude, we need 
to bin the mainshock magnitudes in intervals [Mi; M2]. We have chosen the intervals (when 
available) [1.5; 2], [2; 2.5], [2.5; 3], and so on up to [7; 7.5]. 

• In each mainshock magnitude interval [Mi ; M2] , we consider all triggered sequences emanating 
from mainshocks with magnitude in this interval [Mi;M2]. We translate each triggered sequence 
so that their origin of time (their mainshock time) is moved to the common value t = 0. Then, we 
stack all these sequences to obtain a mega-sequence containing all triggered events coming from 
mainshocks in the same magnitude interval [Mi ; M2] . 

• We then bin the time axis according to a geometrical series, and estimate the average rate of 
events within each bin. The resulting function is fitted using the modified Omori law 

N{t) =B + , " ^ , (40) 
^ ' {t + c)P ' ^ ' 

where B is a positive parameter introduced to account for the background seismicity assumed to 
be superimposed over the genuine triggered sequences. The time shift c ensures the regularization 
of the seismic rate at t = 0. 

Varying [Mi; M2] allows us to test for a possible dependence of p as a function of the magnitude of 
the mainshock. In order to (i) maximize the size of the data sets, (ii) avoid bias due to incompleteness 
of the catalogs, and (iii) test the robustness of our conclusions, we will use 

1. for Ml > 3: four different sub-catalogs (1932 - 2003, 1975 - 2003, 1992 - 2003, 1994 - 2003), 

giving four estimates for the p-value; 
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2. for [Ml = 2.5; M2 = 3]: three sub-catalogs covering 1975 - 2003, 1992 - 2003 and 1994 - 2003, 

giving three estimates for the p- value; 

3. For [Ml = 2; M2 = 2.5]: two sub-catalogs covering 1992 - 2003 and 1994 - 2003, giving two 
estimates for the p-value; 

4. For [Ml = 1.5; M2 = 2]: one sub-catalog covering 1994 — 2003, giving a single estimate for the 
]> value. 

To explore further for the robustness of our estimates for p, we will also impose c = to test for 
the sensitivity with respect to the beginning of the stacked time series. We will also remove bins at 
random and re-estimate the p-values of these pruned stacked sequences. Putting all these estimates 
together for a given magnitude interval [Mi;M2] allows us to obtain a mean value and a standard 
deviation. Note that for each magnitude interval, we have several thousand events in the decay rate 
function, ensuring an adequate estimation of the p-value. For large magnitudes, there are only a few 
mainshocks contributing to the stacked sequence, but each of them have numerous aftershocks. For 
small magnitudes, each mainshock contributes few triggered events (some have none) but, due to their 
large number, the stacked sequences have also a sufficiently large number of events, even for the smallest 
magnitude intervals. 

We now present the two declustering methods that we have implemented to select the mainshocks. 

5.2.2. 1st declustering technique 

The first method is essentially the same as defined in [Helmstetter, 2003]. First, every event in the 
catalog is defined as a mainshock if it has not been preceded by an event with larger magnitude within 
a fixed space-time window T x d, with T = 1 year and d = 50 km. Looking for events triggered by this 
mainshock, we define another space-time window following it. The time dimension of the window is 
also set to 1 year, whereas the space dimension depends on the rupture length of the main event. This 
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spatial window is chosen as a circle of radius equal to the mainshock rupture length 

which is the average relationship between L and magnitude Ml for California [ Wells and Coppersmith, 
1994]. If any event falls within this space-time window, it is considered as triggered by the main event. 
Note that, since the reported uncertainty of events localization is about 5km, we set the size of the 
spatial window to 5km if L < 5km. We have also checked the stability of the results by considering a 
spatial neighborhood of radius 2L rather than L for the triggered events. 

5.2.3. Second declustering technique 

The second declustering technique is the same as the first one, except for one element: the space 
window used for qualifying a mainshock is not fixed to d = 50km but is chosen to adapt to the size 
of the rupture lengths L{Mi) given by (41) of all events of all possible magnitudes ML{i) preceding 
this potential mainshock. In other words, a potential mainshock is selected if there are not preceding 
events within one year in its past, which are at a distance less than twice their own rupture length. The 
space-time window for the selection of triggered events is the same as for the first declustering method. 

This second procedure of declustering is perhaps more natural than the first one, especially for 
small magnitude events. To take an extreme example, in the first method, an event of magnitude 1.5 
can not be selected as a main event if it falls within 5Qkm of a previous larger event, even if the rupture 
length of the latter is much smaller than 50km. This obviously prevents almost all events of magnitude 
1.5 to be considered as mainshocks, and thus biases the statistics. Mainshocks of low magnitudes 
which are selected by the first declustering method will on average occur within areas of very low 
seismicity rate, so that the composite triggered sequence will be under-representative. Moreover, all 
events located at say 51A;m of an event of magnitude Ml = 7.5 are defined as main events, whereas 
they probably belong to its triggered sequence. We can thus expect the first declustering technique to 
penalize heavily the quality of the stacked sequence associated with mainshocks of low magnitudes. 
The second declustering technique do not show such biases. 
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5.3. Results 

5.3.1. Least-square fit of the seismic rate as a function of time with equation (40) 

Figures 5-8 show sets of typical seismic decay rates of stacked sequences for several magnitude 
intervals of the mainshocks. Only half of the individual Omori curves are drawn for the sake of clarity, 
as in all similar figures in the remaining of the paper. Figure 5 (respectively Figure 6) corresponds to 
the period from 1932 to 2003 when using the first (respectively second) declustering technique, with 
mainshock magnitudes above Ml = 3. Figure 7 (respectively Figure 8) corresponds to the period 
from 1994 to 2003 when using the first (respectively second) declustering technique, with mainshock 
magnitudes above Ml = 1-5. Very similar plots are obtained for diff'erent time periods and by varying 
the size from L to 2L of the spatial domain over which the triggered sequences are selected. For large 
mainshock magnitudes, the roll-off at small times is due to the observational saturation and short-time 
lack of completeness of triggered sequences [Kagan, 2004]. In some cases, one can also observe the 
cross-over to the constant background at large times. 

Figure 9 shows the fitted p-values as a function of the magnitude of the mainshocks for each of the 
four sub-catalogs (the abscissa corresponds to the magnitude (Mi -|- M^)/^ in the middle of the range 
[Mi; M2] for each interval). We use a standard least-square fit of the seismic rate as a function of time 
with a weight proportional to t for each bin to balance their relative importance. We take into account 
the possible presence of a background term as shown in equation (40). 

Figure 10 plots the average j3- values and their error bars as defined above. Both figures exhibit a 
very clear increase of the p-value with the mainshock magnitude. For magnitudes Ml > 3, a linear fit 

p{Ml) = 0.12Ml + 0.28 (42) 

is shown as the straight line in Figure 10. The deviation from this linear behavior for the lowest 
magnitudes Ml < 2.5 can be attributed to the biases associated with the first declustering technique, 
as explained above, when comparing with the results of the second declustering method. 

Figure 11 shows the p- value and its standard deviation as a function of mainshock magnitude 
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obtained with the second declustering method. As with the first declustering method, this data is 
well-fitted by a linear relationship 

p{Ml) = O.IOM + 0.37 . (43) 

5.4. Maximum Likelihood Estimation method 

In order to check the reliability of the p- values obtained in the previous analysis, wc also used 
a maximum likelihood estimation (MLE). We obtain the MLE of the p- value in a finite time window 
from t to tu, by maximizing the probability that the particular temporal distribution of the N events 
in that window results from Omori's law with this particular p value. The most likely p value is then 
given by the implicit equation [Huang et al., 2000] 



1 t'lj'^\nt-tP-'^lntu 



= {lnt)N, (44) 



where the tn are the time occurrences of the N events between t and tu and 

1 ^ 

(lni)w = -^lni„ . (45) 



n=l 

-4 



We fixed fa = 1 year, while t was varied continuously from 10~ to 0.5 year and we solve for p in the 
implicit equation (44). We could thus check both the value and the stability of p with sample size and 
to time boundary effects. 

Figure 12 shows p as a function of t for the post-1932 sub-catalog, using the first declustering 
technique, with R = 2L, for different magnitude ranges. This figure should be compared with Figure 
5. One can observe that, for 3 < M < 3.5, p varies very slightly with time t, with a value of about 0.6, 
whereas the least-square fit gave p = 0.67 for the whole time range (corresponding here to t = 10~^ 
year). For 4 < M < 4.5, p is close to 0.9, while the least-square fit gave p = 0.85. The case 5 < M < 5.5 
is different as the maximum likelihood yields a p-value of about 0.85 whereas the least-square fit 
gave p = 0.96. It should be noted here that the least-square fit detected a significant non-zero 
background value B, and that the MLE method doesn't take into account such a term, which leads to 
an underestimation of p as the time-distribution is flatter at larger times (which is also expressed by a 
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strong drop of p at large time t). The case 6 < M < 6.5 yields p close to 0.8 whereas the least-square 
fit gave p = 1.29, but the MLE method also takes account of events for times larger than 0.1, where 
strong secondary aftershock activity disturbs the distribution and would tend to define a power-law 
with lower exponent. Considering the 7 < M < 7.5 range, where no such bias occurs, the MLE method 
gives p very close to 1 for 10~^ < t < 10~^ whereas p = 0.99 for the least-square fit. The conclusion is 
thus that, when a background B is absent, p-values obtained with both methods agree very well, which 
strenghtens our belief in the reliability of the exponents determined with the least-square fit. This 
least-square fit has here the advantage that it takes into account the possible existence of a non-zero 
background term B. 

Using the same data set and the 2nd declustering method, we obtained Figure 13, which can be 
compared to Figure 6. Once again, exponents agree very well (except for the 6 < M < 6.5 range, for 
reasons explained above). 

We also check the consistency of both methods for the post-1994 catalog. The MLE provided us 
Figure 14 for the first declustering method, which has to be compared with Figure 7. Wc note that 
p-values perfectly agree up to range 3.5 < M < 4. For the next magnitude range, the MLE provides a 
p-value that is continuously decreasing with t. This can be rationalized by a look at the least-square 
fit, which shows that the distribution converges toward a constant background rate at large times, so 
that the ML inversion is systematically biased towards low values. The 5.5 < M < 6 gives p close to 
0.9 compared with 1.07 with the least-square fit. Once again, the existence of non-zero background 
term certainly biases the MLE. The last magnitude range, 6.5 < M < 7 offers the largest discrepancy 
between p-values, but it should be noted that for times larger than 0.01 year, the p- value varies between 
1.15 and 0.8, so that the average maximum likelihood value agrees well with the least-square inversion. 
Even in the worst case {t = 10~^ year), the MLE method even emphasises the variation of p with 
magnitude. 

The results for the 2nd declustering technique on the same sub-catalog are displayed in Figure 15 
(which should be compared with Figure 8). The p- value for the lowest magnitude ranges are quite low 
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(0.2-0.3) but this is once again due to the background rate. The value obtained in the 3.5 < M < 4 
range {p around 0.7) is strongly oscillating. This is indeed also the case for the time distribution shown 
in Figure 8, for which p = 0.57. Note that the MLE method yields p = 0.65 for t = 10~^, close to the 
least-square fit value obtained on the same data range. We have the same comments as above for the 
4.5 < M < 5 which displays a noticeable background term. The next magnitude range gives a p-value a 
bit less than 0.9, whereas the least-square fit gave p = 0.85. The same comments as above apply for the 
6.5 < M < 7. 

Overall, the conclusion is that, when the background rate is absent from the seismic decay, p- values 
inverted from both methods agree very well, and that even when we observe a discrepancy, the ML 
inversion amplifies the variation of p with magnitude. 

5.5. Tests of our procedure with synthetic ETAS catalogs 

To test the reliability and robustness of our results, we now analyze simulated catalogs with known 
statistical properties following exactly the same procedure as for the real catalogs. We generated 
synthetic catalogs using the ETAS model, running the code of K. Feltzer and Y. Gu (spring 2001) 
modified by A. Helmstetter (2003). This model has a fixed magnitude-independent Omori p- value as an 
input. Thus, by construction, synthetic catalogs generated with the ETAS model should exhibit Omori 
laws with magnitude-independent exponents. Applying our procedure to such synthetic catalogs, which 
are known to be very similar to real catalogs in many of their statistical properties [Helmstetter and 
Sornette, 2003] , allows us to investigate whether the magnitude-dependence of the p- value reported 
above could result from some bias introduced by our analysis rather than being a genuine property of 
earthquake catalogs. 

In the ETAS model, a main event of magnitude M triggers its own primary aftershocks according 
to the following distribution in time and space 

^.(r, t) drdt = K 10«(---o) , (46) 

where r is the spatial distance to the main event (considered as a point process). The spatial 
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regularization distance d accounts for the finite rupture size. The power law kernel in space with 
exponent jj, (this exponent should not be confused with that of the distribution of the stress fluctuations 
defined in (12)) quantifies the fact that the distribution of distances between pairs of events is well 
described by a power-law [Kagan and Jackson, 1998]. In addition, the magnitude of these primary 
aftershocks is assumed to be distributed according to the Gutenberg-Richter law of parameter b. The 
ETAS model assumes that each primary aftershock may trigger its own aftershocks (secondary events) 
according to the same law, the secondary aftershocks themselves may trigger tertiary aftershocks and so 
on, creating a cascade process. The exponent 1 + 9 is not the observable Omori exponent p but defines 
the local (or direct) Omori law [Sornette and Sornette, 1999; Helmstter and Somette, 2002a]. The two 
exponents b and 9 should not be confused with those used for the Green function of viscous relaxation. 

Using the ETAS code, we thus generated a catalog of earthquakes located within a three-dimensional 
slab of horizontal dimension 500 x 500 km^ and thickness 20 km. We added a noise with amplitude of 5 
km to the position of each event to simulate the location uncertainty of real catalogs. The parameters 
of the ETAS model were chosen as follows: b = 0.9, 9 = 1.1, c = 10"^ day, K = 0.002, and a = 5 = 0.9. 
The characteristic spatial distance d in the ETAS kernel was taken equal to the event rupture length 
(which we deduced from its magnitude, according to the Wells and Coppersmith (1994) relationship 
(41)), while the spatial decay exponent /i was fixed to 1.0. The miniinuni magnitude of generated events 
was set to Mq = 0.5, and we introduced a truncation of the Gutenberg-Richter distribution such that no 
event of magnitude Mi > 8.0 are allowed. The rate of background events was set to 50 events/day. The 
obtained synthetic catalog is 25 years long, and only the 10 last years are retained to minimize temporal 
edge effects at the beginning of the time-series. Finally, only events of magnitude larger than 1.5 were 
kept in the catalog, to mimic the effect of a magnitude detection threshold of real seismic networks. 
We performed several simulations until we generated a catalog similar to the post- 1994 sub-catalog we 
previously analyzed: by similar, we mean that the synthetic catalog has approximately the same total 
number of events and the same number of events with magnitude Mi > 7.0. With these parameters, 
the branching ratio (mean number of triggered events per shock) is n = 0.983 and the characteristic 
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time t* below which the bare Omori law ^ is renormalized into the observable global Omori law 

~ l/t^~^ is t* = 1.1 • 10^'' years. Thus, the p-value of any aftershock sequence is by construction equal 
to p = 1 — ^ = 0.9 in the time interval [10~'^year; lyear] that we used for our analysis, which is the same 
time scale used for our analysis of aftershock sequences of real catalogs. 

The selection of mainshocks and aftershocks in our synthetic catalog was performed using the 
second declustering algorithm (with R = 2L) described in section 5.2.3, and we used the same stacking 
method to derive empirically the p{Mi) relationship. The results obtained with the fits of the logarithm 
of the stacked seismic rates as a function of the logarithm of time are shown in Figure 16. The p- values 
are found independent of the magnitude of the mainshock and close to the correct value. There is 
actually a tendency for the p-value to decrease with the mainshock magnitude (which is the opposite 
of the effect predicted by our model and reported for the real catalogs). The origin of this effect is the 
following: for mainshocks of small magnitudes, there are so many stacked time-series that fluctuations 
average out allowing to retrieve a precise p-value; in constrast, after mainshocks of large magnitudes, 
strong secondary aftershock sequences often occur which introduce large fluctuations. As there are only 
a few stacked aftershock series associated with the relatively rare large mainshocks, the fluctuations of 
the average seismic rates after the large mainshocks do not average out; bursts of strong aftershocks 
tend to bias the p downward leading to its under-estimation. To minimize this effect for events of large 
magnitudes, we fltted the Omori law over the beginning part of the time series and removed the end of 
the time-series to compute p (see Figure 16). 

We also inverted p on the same synthetic data using the maximum likelihood formula (44) in a 
finite time window from t to tu, following the same method as for the real catalogs. We span t up to 
tc//2. Figure 17 plots the p-value in diflferent mainshock magnitude ranges as a function of t. Since 
the MLE method is sensitive to the background seismicity, using the lessons from our previous analysis 
on the real catalogs, we restricted tu to avoid biases due to the background seismicity. Thus, for 
magnitudes up to 3, we did not take into account data for times larger than 0.01 year. This upper 
cut-off has been set to 0.1 year for the 3.5 — 4 magnitude range, and no cut-off (that is, = 1 year) 
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was imposed for larger magnitude ranges since their larger triggered seismicity makes the background 
negligible up to one year after the mainshocks. The results are not sensitive to these specific cut-offs 
as long as they correspond to negligible background seismicities. Figure 17 clearly shows that the 
p-values cluster around p = 0.9 whatever the magnitude range. Oscillations occurring at large times 
correlate with oscillations observed on the binned data, and betray the influence of secondary aftershock 
sequences. 

These results and the accuracy of the recovered p-values in our synthetic catalog strengthen one's 
confidence in our reported results that the magnitude dependence of the p-value in the real Southern 
California catalogs is a genuine effect and not an artifact of our data analyzing procedure. 

6. Discussion 

6.1. Summary 

We have proposed a new physically-based "multifractal stress activation" model of earthquake 
interaction and triggering based on two simple ingredients: (i) seismic ruptures result from activated 
processes giving an exponential dependence of the seismic rate on the local stress; (ii) the stress 
relaxation has a long memory, typically larger than one year. The combination of these two effects 
gives rise in a rather general way to seismic decay rates following mainshocks that can be well-described 
by apparent Omori laws with exponents p which are linearly increasing with the magnitude Ml of 
the mainshock. This p{Mi) dependence can be interpreted as a temporal multifractality, that is, as a 
continuous spectrum of exponents, each exponent being associated with a given singularity strength 
(mainshock magnitude). 

While rather general, these predictions require, within our model, that the two exponents 9 (stress 
relaxation) and /x (stress strength distribution) verify approximately the condition /x(l + 9) = 1. We 
stress that the special case = 2,9 = —1/2), which has been shown to give an exact multifractal 
process, obeys this condition. Since 9 and /x are two inputs, our theory is mute on the possible origin 
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of this condition. The fact that such a constraint appears as the condition to observe an Omori power 
law is a very interesting prediction to test in the future. As a bonus, this condition predicts the 
multifractaUty expressed by a dependence of the Omori exponent on the earthquake magnitude. These 
results can also be seen to provide a generalization to the multifractal random walk {^, = 2,9 = —1/2) 
of [Muzy and Bacry, 2002; Schmitt and Marsan, 2001], by showing that multifractality is not an isolated 
property at a single point in the plane {iJ,,6) but occurs over a line of co-dimension 1. At this stage, 
we can only conjecture that a broader model embodying the self-organization of the stress field and 
earthquake space-time organization will lead to the prediction that indeed the two exponent fi and 9 
are not independent but are linked by the condition + ^) = 1. 

These predictions have been tested by a careful and detailed analysis of earthquake sequences in 
the Southern California Earthquake catalog. The robustness of the results obtained with respect to 
different time intervals, magnitude ranges, declustering methods suggests that we have discovered a 
new important fact of seismicity: the apparent power law relaxation of seismic sequences triggered 
by mainshocks has indeed an exponent p increasing with the mainshock magnitude by approximately 
0.1 — 0.15 unit for each magnitude unit increase. The fits (42) and (43) of the data are in agreement 
with the theoretical prediction (37) of the proposed multifractal stress activation model. 

6.2. Intuitive "proof" that p{Ml) increeises with Ml for any multifractal generalization of 
Omori's law 

Here, we give an heuristic and intuitive reason why, if p varies with Ml, this can only be by 
increasing with the mainshock magnitude. 

Consider the plate tectonic process which continuously produces earthquakes, which trigger other 
earthquakes and so on, with a productivity increasing with the earthquake magnitude. Let us study 
the temporel evolution in a fixed spatial domain. This temporal evolution can be viewed to define 
a statistically stationary measure defined on the temporal axis, the measure determining the rate of 
earthquakes at any possible instant. A general method for quantifying such measure is to calculate its 
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multifractal spectrum. For instance, if the seismic rate is constant, the measure is uniform and the 
multifractal spectrum is reduced to a point of dimension /(a) = 1 at the singularity strength a = 1. 
Note that this notation a refers in the present discussion to the exponent of a local singularity and not 
to the exponent of the productivity law. 

An Omori sequence with exponent p corresponds to a singularity (to the right) equal to 1 — p (for 
p ^ 1). Let us calculate the singularity spectrum of the seismic rate measure. This is usually done via 
the calculation of moments of order q, large positive q^s corresponding to small a's, that is, to strong 
singularities (see for instance Chapter 5.2 in [Sornette, 2004]). Now, a large earthquake triggers a strong 
burst of seismicity, giving rise to a strong singularity. From the relation a = 1 — p, to be consistent with 
the multifractal description, a large earthquake must be associated with a strong singularity, a small 
a, hence a large p. Reciprocally, small moment orders q select weak seismic sequences, which are thus 
associated with small local mainshocks. Small g's are associated with large a's, hence small p's. Thus, 
any generalization that allows for a dependence of p on the mainshock magnitude necessarily leads to 
an exponent p increasing with the magnitude. 

By a similar argument in the space domain, the exponent of the spatial decay of the seismic rate 
induced by a mainshock of magnitude Ml should increase with Ml- Thus, in this view, the ETAS 
model is nothing but the mono-fractal approximation of the more general multifractal description of 
seismicity. 

6.3. Self-similarity of earthquakes 

A central empirical observation in seismology is the unability to discriminate between rupture 
processes of large and small earthquakes. The signature of this self-similarity of individual seismic 
events translates into the invariance of the stress drop (inverted from seismic waves) with rupture size. 
It would thus seem that Figure 9 for instance is contradicting this empirical law since the statistics of 
the time-series of triggered events depend on the mainshock magnitude. This could be interpreted as 
reflecting different rupture mechanisms, thus breaking scale invariance. However, such an interpretation 
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is incorrect. First, it is now well-understood that mono-fractality embodied in a universal (Omori) 
power law is not the only signature of scale invariance. Since Parisi and Frisch [1985] and Halsey et al. 
[1986], several complex systems have been shown to exhibit an extended form of scale invariance, called 
multifractality. The p{M) dependence documented here is an example of such multifractality. Second, 
the main ingredients of our theoretical model are scale invariant; (i) the triggering mechanism of a 
single shock is stress activation, independently of the magnitude of the event to be nucleated; (ii) we 
make no assumption the stress drop during an event; (iii) the Cauchy law (11) (exponent /U = 1 in (12)) 
was suggested by Kagan [1994] using the self-similarity argument, and our theoretical results show that 
even in that case p depends on M. Once again, the long memory of the stress relaxation kernel coupled 
with the exponential stress activation are sufficient to predict such a magnitude dependence, which is 
thus compatible with the self-similarity hypothesis. 

6.4. Alternative models 

Could these observations be interpreted differently than with the multifractal stress activation 
model? For instance, a change of the apparent Omori exponent p is consistent with the ETAS model if 
the critical branching ratio n (average number of triggered events per mainshock) is less than unity: if 
n is close to its critical value 1, p is close to 1 — ^, while for n < 1, p goes from 1 — ^ at short time scale 
to 1 + ^ beyond a characteristic time scale t* ~ (1 — n)~^/^ [Somette and Somette, 1999; Helmstetter 
and Somette, 2002a]. The parameter 9 is often found in the range 0.2 — 0.3 for large shocks. Thus, to 
explain our results (42) and (43), one would need to invoke a larger value for 6 of the order of 0.5 and 
to have n depend on mainshock magnitudes. Or if the measurement for small mainshock magnitudes 
are performed at shorter times than for larger mainshocks, this would lead to an increase of the }>value 
because short times should be controlled by the exponent 1 — 6 while longer time reveal the exponent 
l-\-6. However, this last explanation is ruled out by the fact that our determination of the p- values is 
performed on the same time interval for all magnitudes. 

One could also perhaps argue that small earthquakes reveal the critical state of the crust 
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corresponding to n close to 1 while large earthquakes move the crust away from criticality so that 
their triggered aftershock sequences correspond to a small branching ratio n < 1, hence the larger 
exponent. This picture has been recently advocated under the concept of "intermittent criticality" 
(see for instance [Jaume and Sykes, 1999; Goltz and Bose, 2002; Ben-Zion et al., 2003; Bowman 
and Sammis, 2004]). While we cannot exclude that intermittent criticality is the explanation for our 
results, this interpretation has only qualitative predictive power and needs more fine-tuning than the 
multifractal stress activation model since the later predicts precisely the observed linear dependence 
of p{M). For the ETAS model to explain our results, we would need a magnitude dependence of the 
productivity parameter a which does not seem to be observed [Helmstetter, 2003] (note however that 
Helmstetter's measure of a are probably underestimated as they have relied on the assumption of a 
constant p- value and have used a fixed space domain size independent of mainshock magnitude to select 
them). In addition, the multifractal stress activation model is physically-based while the ETAS model 
is only phenomenological and less attractive as an "explanation." 

6.5. Other predictions of the multifractal stress activation model 

6.5.1. Temperature dependence of p-values 

Another prediction of the multifractal stress activation model is the linear dependence of the 
j3-value oc /3 in (37) as a function of the inverse of the temperature. This implies that the strong 
dependence of the p-value as a function of mainshock magnitude should be more visible with a larger 
amplitude for cold regions. This suggests a tantalizing new interpretation of the correlations reported 
between the j?- value and thermal flux [Kisslinger and Jones, 1991]. These authors observed a positive 
correlation, superficially in line with the intuitive idea that hotter material relaxes stress at larger 
rates. We must point out, however, that they did not take account of uncertainties on measured heat 
flow, and that such a correlation was obtained by inverting p on a few dozens of individual aftershock 
sequences following mainshocks of different magnitudes (from 5.1 to 7.5). Using a world-wide catalog 
of events of magnitudes larger then 5, within which they selected nine subregions, Marsan and Bean 
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[2003] also observed a positive correlation between p and heat-flow. The exponent p was computed 
through the estimation of the power-spectrum of the time-series of events. Their figure 10 shows error 
bars on both p and heat flow. Despite the rather strong apparent correlation, we would like to stress 
that such results are difficult to interpret because the data points are concentrated around average 
heat-flow values, and there are only a few data for very large or very small heat-flow values. Thus, 
the dispersion of data is highly inhomogeneous around the mean and such a biased sampling in heat 
flow may yield misleading results. Here, again we propose to reconsider such an analysis with a more 
uniform sampling in heat flow and according to the magnitude of main events. Indeed, since our model 
predicts that the magnitude and temperature effects are entangled, a careful analysis of the magnitude 
relationship p{M) is needed before testing for a temperature or heat-flow effect. 

We predict a negative correlation between p and temperature which seems at first rather puzzling, 
as it predicts that the seismicity rate will relax more slowly at larger temperatures (every other 
parameter being kept equal). This paradox is resolved by distinguishing between absolute seismicity 
rate and relative decay rates. Indeed, expressions (1) and (2) show that the seismic rate can be 
decomposed into the product of two exponential terms 

A(f,t)~e-'5^«« e+'3^^(^~*) . (47) 

The first exponential e~^^°^^ controls the absolute seismicity level and exhibits the usual effect that, 
the higher the temperature kT = 1/(3, the larger is the seismicity rate. The second exponential 
g+0vi:{r,t) gives the dependence of the seismicity rate as a function of time due to stress interactions, 
as described in (3) and following equations. It exhibits an inverse temperature effect: the larger the 
temperature, the smaller it is; together with the dependence T,{f,t) oc ln{T/t), it is responsible for 
the paradoxical temperature dependence of the p-value. In summary, the larger the temperature, the 
larger is the absolute seismicity level but the smaller is the p-value of Omori's law. More generally, 
distinguishing between absolute and relative levels is essential when dealing with scale-invariant laws. 
The same paradoxical effect occurs for instance with fractal dimensions: a large fractal dimension 
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does not necessarily imply a large density, since the density is the product of an absolute term setting 
the units and a scale-dependent part function of the fractal dimension: the fractal dimension only 
describes the relative values of densities at two different scales, not their absolute values. Confusing 
these two contributions to the density is often done in the geophysical literature in which the dimension 
is incorrectly considered as a first-order measure of density; in reality, it is only a relative measure 
comparing densities at different scales. To come back to Kisslinger and Jones [1991] 's observations, 
rather than large ^>-values associated with large heat flows, we suggest a more complicated dependence 
involving a higher absolute seismicity level and a weaker dependence of p on the mainshock magnitude 
as the temperature (heat flow) increases. Everything being kept equal, as the temperature increases, 
the p{M) values are predicted to decrease. This may perhaps also explain the careful laboratory 
observations of Carreker [1950] who showed for Pt (platinium) that the strain rate exponent in creep 
experiments in the primary regime decreases with temperature. Some other complications can occur 
if the integral time scale T also depends on the temperature, but if it is very large as we expect (at 
least a few years), its variation with the temperature 1//3 will not have significant consequences for the 
seismic aftershock decay rates studied here. 

The decrease of p with increasing temperature has been also documented in a numerical simulation 
of the sandpile model of Cliristcnscn and Olami [1992] in which elements break by static fatigue 
according to the rate (1) with (2). It was found [Helmstetter, 2002] that this decrease oi p with T is 
the opposite of the prediction of the standard thermal activation model when neglecting interactions 
between rupture and thus results fundamentally from multiple interactions between events. 

6.5.2. Magnitude-dependence of the p-values in other works 

The increase of the p-value with the mainshock magnitude implies that aftershock sequences 
of large events decay at a faster rate than aftershock sequences of small events. Yet, they have a 
much larger number of events and can thus be observed in general over longer times. According to 
our analysis, mainshocks of magnitudes going from 5 and 7.5 (for which the Omori law is generally 
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inverted), the average p- value increases from 0.9 to 1.2. These are the typical values generally given in 
the literature. Our detection of a systematic increase of p(Ml) was only made clear by extending the 
range of magnitudes over which the Omori law is tested. 

There has been only few attempts to try to correlate values of individual aftershock time-series 
with the magnitude of the mainshock (see [Kisslinger and Jones, 1991], for example, in which no 
correlation was found which is not surprising as our analysis shows that one has to stack many sequences 
following similar magnitude shocks in order to average fluctuations out). The only other work we are 
aware, which uses stacked sequences, is the one of Bohnenstiehl et al. [2003]. These authors studied 
the time-clustering of events of magnitude M > 3 along the Mid- Atlantic Ridge, using catalogs derived 
from the detection of T- waves. Each event is quantified using a source level {SL), expressed in dB 
units, which is a logarithmic measure of its size. The earthquake catalog is then represented as a series 
of point process events located at each earthquake's occurrence time, whose power spectrum can be 
computed and fitted with a power law of exponent aps (they indeed use an Allan factor analysis to 
determine aps). Tuning the detection threshold SLq of events, they note a tendency for aps to decrease 
as SLq increases. Since the p- value is related to aps by p = 1 — aps , then their observation confirms 
that p decreases with M (since, due to the Gutenberg-Richter law, their power-spectrum is dominated 
by the statistics related to the smallest magnitude events in the catalogue). 

Mines offer meso-scale crustal laboratories to study superficial earthquakes. The sizes of events are 
most often of magnitude 3 and lower. As stated above, such events trigger a few aftershocks so that 
the recovery of an Omori law requires stacking many events. Marsan ct al. [1999] studied the time 
clustering of events in the Creighton mine (Ontario, Canada), and built a stacked time-series which is 
equivalent to ours if we consider that the size of the aftershock area after each event is the size of the 
mine itself. Despite the fact that they did not remove shocks already tagged as aftershocks from the 
main events list (which is a minor departure from our procedure considering the small size of all events), 
they measured p = 0.4. They do not mention the magnitude range of events they used in their paper, 
but such a p-value suggests (from our observations) that the majority of their shocks would roughly be 
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of magnitude 1 — 3, which is a very reasonable prediction for such events. According to our model, we 
should also take into account the difference of temperature to get a reliable quantitative comparison, 
but the low p-value found by Marsan et al. [1999] is already in qualitative agreement with our model. 

6.5.3. Dependence of the p-value on the shear and normal stress components 

The multifractal stress activation model may rationalize the empirical observations of a dependence 
of the p- value on the shear and normal stress components, which suggests the relevance of fluid pressure 
[Scholz, 2002]. These properties which cannot be accounted for by the Dieterich model [Dieterich, 1994] 
can naturally arise from the impact of fluid pression on the stress redistribution. 

6.5.4. Distribution of seismic rates and of stress source strengths 

Our multifractal stress activation model can be further falsified by comparing its prediction of the 
distribution of seismic rates Pr(A), once the distribution (12) of stress source strengths and the stress 
relaxation memory kernel (13) are specified. As Pr(A) seems to be a power law with exponent n « 1.5 
[Work in progress], this seems to imply a truncation of the distribution of the stress source strengths. 
More work is required to clarify this issue and will be reported elsewhere. 

6.5.5. Implications for prediction 

If true, our discovered multifractal Omori law has probably many important implications for 
understanding earthquake patterns and for prediction, that need to be investigated in details. The 
interplay between magnitude and decay rate found here leads to new interpretations of spatio-temporal 
patterns of seismicity. In particular, this will shed light on the underlying basis of various pattern 
recognition techniques that tend to sort earthquakes in terms of their magnitudes: according to our 
theory, different magnitude classes which are controlled by the same underlying physics give rise to 
distinct triggering signatures. 
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6.5.6. Lower bounds for a minimum earthquake magnitude 

The reported dependences (42) and (43) of p{Ml) suggests the existence of a lower bound Mmin for 
the minimum magnitude of earthquake able to trigger other events. Indeed, from the condition p > 0, 
we obtain M^in > —2.3 using (42) and Mmin > —3.7 (43). The real uncertainty is difficult to estimate 
and is probably of the order of the difference between these two values. Since there is no way we can 
address all known and unknown systematic error terms and uncertainties, the best way to estimate the 
uncertainty is by comparison of two different procedures with several different implementation, as we 
have done. This leads to the estimate Mmin > — 3 ± 1. Note that the existence of Mmin does not imply 
that there are no earthquakes of smaller magnitudes, only that those smaller events do not play a role 
in the triggering process (they do not trigger other events). 
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Figure 1. Numerical evaluation of the normalized 7(t)/7(x — c/T) (linear scale) defined by (31) as a 
function of t = t/T (logarithmic scale), for several pairs (/i, 6*) which obey the condition /i(l + 9) = 1 
exactly, using c/T ~ 10^^. We verify the existence of a linear decay of j{t) in the variable ln<, which 
qualifies an Omori power law (36). Since the logarithm of the seismicity rate X(t) is proportional to 
7 (with a coefficient of proportionality involving the inverse temperature and the magnitude of the 
mainshock), a linear behavior qualifies a power law decay of the seismic rate, whose range is rather 
sensitive to the temperature and the ratio c/T. 
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Figure 2. Same as Fig. 1 for several values of at fixed /i = 2, to show the sensitivity of the existence 
of a linear behavior of ^{t) as a function of Int for values of {fi,9) which depart from the condition 
fi{l + 9) — 1. Depending on the temperature and mainshock magnitude, a power law regime for the 
seismic rate can be observed approximately over several decades even for values of (fi, 9) which depart 
from the condition fi{l + 9) = 1. 
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Time evolution of the cumulative magnitude distribution 




Time (years) Magnitude 
Figure 3. Time evolution of the yearly cumulative magnitude distribution from 1932 to 2003 included, 
obtained from the Southern Californian earthquakes catalogue with revised magnitudes (available at the 
Southern California Earthquake Center). Magnitudes are given with a 0.1 resolution from 1932 to 
present, in a zone roughly comprised within 32° to 37°N in latitude, and within —114° to —122° in 
longitude. 
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Cumulative magnitude distribution for various years 




4 5 6 7 8 

IVIagnitude 

Figure 4. Cumulative magnitude distribution (CMD) for different time spans of the SCEC catalogue 
used to define the approximate lowest magnitude Ml of completeness. The CMD is approximately linear 
for Ml > 3 for the whole lifespan of the catalog, while it is linear for all magnitudes larger than 2.5 
(respectively 2 and 1.5) for shocks after 1975 (respectively after 1992 and 1994). 
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Omori laws for shocks occurring after 1932 - 1st declustering technique 
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Figure 5. Seismic decay rates of stacked sequences for several magnitude intervals of the mainshocks, 
for the period from 1932 to 2003 when using the first declustering technique. 
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Omori laws for shocks occurring after 1932 - 2nd declustering technique 
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Figure 6. Seismic decay rates of stacked sequences for several magnitude intervals of the mainshocks, 
for the period from 1932 to 2003 when using the second declustering technique. 
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Omori laws for shocks occurring after 1994 - 1st declustering technique 
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Figure 7. Seismic decay rates of stacked sequences for several magnitude intervals of the mainshocks, 
for the period from 1994 to 2003 when using the first declustering technique. 
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Omori laws for shocks occurring after 1994 - 2nd declustering technique 
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Figure 8. Seismic decay rates of stacked sequences for several magnitude intervals of the mainshocks, 
for the period from 1994 to 2003 when using the second declustering technique. 
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p-value as a function of magnitude for the various sub-catalogs 
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Figure 9. p- values of the Oinori law (40) obtained by the procedure described in the text for mainshocks 
(defined using the first declustering algorithm) as a function of the main events' magnitude, for the 
different sub-catalogs of lifespans given in the inset. 
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Linear fit of p(M) 
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Figure 10. Average p-values and error bars obtained from Figure 9 as described in the text. The 
straight line is the hnear fit with p{M) = 0.12Ml + 0.28. 
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Figure 12. Maximum Likeliliood Estimation of tlie p-value (formula (44) witli tjj = 1 year, wliile t was 
varied continuously from 10^"* to 0.5 year) as a function of t for the post-1932 sub-catalog, using the 
first declustering technique, with R = 2L, for different magnitude ranges. 
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Figure 13. Same as Figure 12 for tlie post-1932 sub-catalog, using the second declustering technique, 
with R = 2L, for different magnitude ranges. 
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Figure 15. Same as Figure 12 for the post-1994 sub-catalog, using the second declustering technique, 
with R = 2L, for different magnitude ranges. 
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Omori laws for ETAS catalog 
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Figure 16. Same as Figure 6 for the syntlietic catalog generated with the 3D ETAS model, using the 
second declustering technique, with R = 2L, for different magnitude ranges. 
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Figure 17. Same as Figure 12 for the synthetic catalog generated with the 3D ETAS model. The MLE 
formula (44) is applied in a finite time window fi'om t to tjj, following the same method as for the real 
catalogs. The upper value tu is 0.01 year for magnitudes up to 3, 0.1 year for the 3.5 — 4 magnitude 
range, and 1 year for larger magnitude ranges, so as to minimize bias due to the background seismicity. 



